Speech presence detector



May 19, 1970 G. A. HELLWARTH SPEECH PRESENCE DETECTOR Filed Oct. 13, 1967 3 Sheets-Sheet 1


United States Patent Office 3,513,260

SPEECH PRESENCE DETECTOR

George A. Hellwarth, Gardner D. Jones, and Albert C. Ruocchio, Raleigh, N.C., assignors to International Business Machines Corporation, Armonk, N.Y., a corporation of New York

Filed Oct. 13, 1967, Ser. No. 675,230
Int. Cl. Gl 1/04
U.S. Cl. 179-1
3 Claims

ABSTRACT OF THE DISCLOSURE

FIELD OF INVENTION

This invention relates to speech processing in general and more particularly to speech presence detectors for indicating the beginnings and continuances of speech signals.

PRIOR ART

Speech presence detection is useful in many speech processing areas. One area where it has found extensive use is in recording speech for use in data processing systems, where limited storage requires the elimination of the many and prolonged periods of silence which occur in natural speech. The use to which the recorded speech is put is immaterial to this application, and it may be utilized for one or more purposes.

Several techniques have been employed in the prior art for detecting the beginning and continuance of speech. In one prior art system, a detector was utilized to derive the envelope of the incoming signal, and this envelope was passed through a low pass filter to determine the presence of speech. The noise envelope was generally D.C. and contributed nothing to the filter output. This system has one serious drawback. The use of suitable linear filtering resulted in a time delay in detecting the onset of speech, thereby distorting the initial portion of the signal. This problem is overcome through the use of analog delay and the employment of a parallel signal path such that signal detection can be made to coincide with the start of the delayed signal. The solution of this problem has not, however, been considered generally successful due to its cost.

Another prior art speech presence detector works satisfactorily provided the noise and speech signal levels remain within predetermined narrow limits. Therefore, such detectors are unsuitable for use in conjunction with a telephone distribution system, since the signal to noise ratio in the average telephone network may vary from 13 to 40 db and this amount of variation exceeds the satisfactory operating range of the circuit.

SUMMARY OF INVENTION

One object of this invention is to provide a speech presence detector which is fast in operation so as to prevent the loss of the beginning of a speech signal.

Another object of the invention is to provide a speech presence detector which is capable of operation in an environment having a large variation in signal to noise ratio.

A further object of the invention is to provide a speech presence detector which is not adversely affected by large long-term variations in the noise level.

Yet another object of the invention is to provide a speech presence detector which is capable of operation in the presence of wide variations in the amplitude of the speech signal which is to be detected.

The invention contemplates a speech presence detector suitable for connection to a telephone speech distribution system and comprises a peak detecting means having a time constant large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and said time constant also being selected so as to follow initial pitch period variations in the speech signal, means for differentiating the peak detector output to accentuate the alternating current component and block the direct current component, and means responsive to the differentiated signal for indicating the presence of speech in response thereto and for continuing said indications provided the period of the alternating component remains within a predetermined range.

BRIEF DESCRIPTION OF DRAWINGS

DESCRIPTION OF THE PREFERRED EMBODIMENT

In FIG. 1 an intermittent speech signal superimposed on steady state, long-term amplitude variable background noise is applied to an input terminal 11. The speech signal is graphically illustrated in curve A of FIG. 3 and includes a plurality of pitch periods in succession followed by a period of silence in which only the background noise is present at the input. The received signal is amplified in amplifier 12 and applied to a peak detecting circuit 14 which provides the output illustrated graphically in curve B of FIG. 3.

The output of peak detector 14 is applied to a pair of series connected differentiating circuits 15 which alter the input waveforms from peak detector 14 as illustrated in curves C and D of FIG. 3. The output of the second differentiating circuit illustrated in curve D of FIG. 3 comprises a plurality of sharply defined voltage spikes coinciding with the sharp voltage rise occurring at the beginning of each pitch period. The output from the second stage of differentiating circuit 15 is applied to timing and level detector 16 ... periods following the termination of speech are rejected.

The length of the time period determines the time after an utterance before the detector indicates the absence of speech.

FIG. 2 shows the specific details of peak detector 14, differentiating circuits 15 and timing and level detector 16. The output of amplifier 12 is applied to the cathode of a diode 20 which has its anode connected to the common junction of a capacitor 21 and a resistor 22 which are both returned to ground. In addition, the common junction is connected to the base of amplifying transistor 23 which has its emitter current supplied by a positive voltage source E through a resistor 24 and its collector connected directly to a negative voltage source E.

Initially capacitor 21 is at ground potential and diode 20 is nonconductive as long as the output of amplifier 12 is above ground. Upon the first negative excursion of the output of amplifier 12, capacitor 21 charges to the peak negative value and discharges via resistor 22. The time constant of resistor 22 and capacitor 21 is selected so that it is large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and simultaneously low enough to follow initial pitch period variations in the speech signals as shown in curve B of FIG. 3. Diode 20 remains reverse biased until capacitor 21 discharges to a point where its voltage applied to the anode of diode 20 is more positive than the input voltage on the cathode. This condition will generally occur at the onset of the next successive pitch period; however, voltage spikes within a pitch period may forward bias diode 20 to charge capacitor 21 as shown after the second and third pitch periods. The pulses thus generated will not, however, change operation, as will be explained later. Minor variations in the noise level do not produce pulses large enough to trigger the detector, because of the particular time constant selected for capacitor 21 and resistor 22.
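The behavior of the peak detector described above can be illustrated with a rough discrete-time sketch. It assumes an ideal diode with no forward drop, and the sample rate, the 5 ms time constant, and the names `peak_detect` and `ripple` are illustrative assumptions; the patent gives no component values.

```python
import math


def peak_detect(samples, sample_rate_hz, time_constant_s):
    """Idealised negative-peak detector: diode 20 feeding capacitor 21 and resistor 22."""
    decay = math.exp(-1.0 / (sample_rate_hz * time_constant_s))
    held = 0.0  # capacitor 21 starts at ground potential
    output = []
    for x in samples:
        held *= decay      # capacitor discharges toward ground through resistor 22
        if x < held:       # input more negative than the capacitor: diode conducts
            held = x       # capacitor charges to the new negative peak (ideal diode)
        output.append(held)
    return output


def ripple(values):
    """Peak-to-peak swing, used here as a measure of the alternating component."""
    return max(values) - min(values)


if __name__ == "__main__":
    fs = 8000    # sample rate in Hz (assumption)
    tau = 0.005  # 5 ms time constant (assumption, not from the patent)
    # A steady 1 kHz tone stands in for a periodic signal above the pitch range.
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(800)]
    # One negative pulse every 10 ms stands in for 100 Hz pitch periods.
    pitch = [-1.0 if n % 80 == 0 else 0.0 for n in range(800)]
    print("tone ripple:", round(ripple(peak_detect(tone, fs, tau)[400:]), 3))
    print("pitch ripple:", round(ripple(peak_detect(pitch, fs, tau)[400:]), 3))
```

With these assumed values the tone leaves only a small ripple on the held voltage, which is close to a direct current level, while the pitch pulses leave a large sawtooth that the later stages can act on.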

The emitter of transistor 23 is connected to the base of an amplifying transistor 26 by a series connected capacitor 27 and a resistor 28 which differentiate the output of transistor 23 to produce the waveform illustrated in curve C of FIG. 3. Series connected resistors 30, 31 and 32 connected between positive source E and negative source E provide collector and base bias potential for transistor 26, which has its emitter connected to ground.

The amplified output at the collector of transistor 26 is again differentiated by capacitor 34 and resistor 39. The twice differentiated pulses illustrated in curve D of FIG. 3 are applied to the base of transistor 37 and cause the transistor to conduct when they exceed a given amplitude determined by the base-emitter contact potential of transistor 37.
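The two differentiating stages and the threshold at transistor 37 can be sketched as two first-order high-pass filters with an inverting gain between them. The filter time constant, the gain of -10, the 0.6 threshold, and the names `differentiate` and `onset_indices` are assumptions chosen for illustration; the inversion is assumed because transistor 26 has a grounded emitter.

```python
import math


def differentiate(samples, sample_rate_hz, time_constant_s):
    """First-order RC high-pass: passes fast edges and blocks the direct current level."""
    dt = 1.0 / sample_rate_hz
    a = time_constant_s / (time_constant_s + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def onset_indices(peak_out, sample_rate_hz, time_constant_s=0.0005,
                  gain=-10.0, threshold=0.6):
    """Return sample indices where the twice-differentiated signal first exceeds the threshold."""
    first = differentiate(peak_out, sample_rate_hz, time_constant_s)
    amplified = [gain * v for v in first]  # inverting amplifier stage (assumed gain)
    second = differentiate(amplified, sample_rate_hz, time_constant_s)
    indices = []
    above = False
    for i, v in enumerate(second):
        now_above = v > threshold          # stands in for transistor 37 conducting
        if now_above and not above:
            indices.append(i)
        above = now_above
    return indices


if __name__ == "__main__":
    fs = 8000
    # Sawtooth like curve B: a sharp negative step every 10 ms, then a slow decay.
    sawtooth = [-math.exp(-0.025 * (n % 80)) for n in range(800)]
    print(onset_indices(sawtooth, fs))
```

In this sketch each sharp negative step of the peak detector output yields one pulse, while the slow decay between steps stays below the threshold.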

The collector of transistor 37 is connected by a current limiting resistor 40 to the common junction of a resistor 41 and capacitor 42 which are each connected to positive voltage source E. Capacitor 42 will charge through resistor 41 to the voltage of source E. Each time transistor 37 conducts, i.e., is driven by a sufficiently large pulse from differentiating circuit 15, capacitor 42 discharges via current limiting resistor 40 through transistor 37. In this manner, the maximum voltage at the junction of resistor 41 and capacitor 42 illustrated by curve E of FIG. 3 is a function of the number of pulses above the threshold applied in a given time to the base of transistor 37.
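The timing action of resistor 41 and capacitor 42 can be approximated by an exponential rise of the junction voltage toward E that each pulse resets. The sketch below treats the discharge through resistor 40 as instantaneous and uses a 0.1 s time constant, a 12 V supply, and a 6 V reference, all of which are assumptions, as are the function names.

```python
import math


def junction_voltage(pulses, sample_rate_hz, rc_s, supply_v, discharged_v=0.0):
    """Voltage at the junction of resistor 41 and capacitor 42, sample by sample."""
    keep = math.exp(-1.0 / (sample_rate_hz * rc_s))
    v = supply_v  # fully charged before any speech arrives
    out = []
    for pulse in pulses:
        if pulse:
            v = discharged_v                        # transistor 37 conducts (idealised)
        else:
            v = supply_v - (supply_v - v) * keep    # recharge through resistor 41
        out.append(v)
    return out


def hangover_time_s(rc_s, supply_v, reference_v, discharged_v=0.0):
    """Time after the last pulse for the junction to rise past the reference voltage."""
    return rc_s * math.log((supply_v - discharged_v) / (supply_v - reference_v))


if __name__ == "__main__":
    fs = 1000
    pulses = [True] + [False] * 199  # a single pulse, then silence
    volts = junction_voltage(pulses, fs, rc_s=0.1, supply_v=12.0)
    first_above = next(i for i, v in enumerate(volts) if v > 6.0)
    print("samples until reference is passed:", first_above)
    print("closed-form hangover (s):", hangover_time_s(0.1, 12.0, 6.0))
```

As long as pulses recur more often than the hangover time, the junction never reaches the reference, which corresponds to the alternating component remaining within the predetermined period range.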

The junction of resistor 41 and capacitor 42 is connected to the base of a transistor 44 which in conjunction with another transistor 45 comprises a comparator. The emitters of transistors 44 and 45 are connected to positive source E by a resistor 46 and the collectors to negative source E by identical load resistors 47 and 48, respectively. A pair of resistors 50 and 51 connected between positive source E and ground provide a reference potential at their common junction which is connected to the base of transistor 45.

As long as the voltage at the common junction of resistor 41 and capacitor 42 remains below the reference potential applied to the base of transistor 45, transistor 44 conducts and transistor 45 is cut off. This condition causes the collector potential of transistor 45 to go to the negative potential of source E as indicated in curve F of FIG. 3, while the collector of transistor 44 assumes a more positive potential determined by the voltage across resistor 47.

As soon as capacitor 42 charges to a voltage more positive than the reference voltage applied to the base of transistor 45, the voltage conditions at the collectors of transistors 44 and 45 reverse to indicate the termination of speech. The bi-polar outputs provided at the collectors of transistors 44 and 45 may be utilized for any purpose such as controlling the recording of the signal on the line in a storage medium to thus eliminate periods of silence to conserve storage.
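The comparator decision and its use for gating a recording can be sketched as follows. The function name `speech_segments`, the sample values, and the choice of a strict less-than comparison are assumptions for illustration only.

```python
def speech_segments(junction_v, reference_v):
    """Return (start, end) index pairs during which the junction voltage is below the reference.

    A voltage below the reference corresponds to transistor 44 conducting,
    which the description treats as the speech-present condition.
    The end index is exclusive.
    """
    segments = []
    start = None
    for i, v in enumerate(junction_v):
        present = v < reference_v
        if present and start is None:
            start = i                      # speech indication begins
        elif not present and start is not None:
            segments.append((start, i))    # capacitor recharged past the reference
            start = None
    if start is not None:
        segments.append((start, len(junction_v)))
    return segments


if __name__ == "__main__":
    volts = [12, 12, 0, 3, 5, 7, 12, 0, 2, 8]  # hypothetical junction voltages
    print(speech_segments(volts, 6))
```

Only the samples inside the returned segments would be passed to storage, which is one way to realize the silence elimination mentioned above.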

While the invention has been particularly shown and described with reference to a preferred embodiment thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

What is claimed is:

1. A speech presence detector for rapidly detecting the beginning and continuance of speech superposed on noise subject to long term variations comprising,

peak detecting means having a time constant large enough to convert periodic signals having a frequency above the basic speech pitch frequency range and steady state noise to direct current components and low enough to follow the initial pitch period variations in the speech signal,

means responsive to the output of the peak detecting means for differentiating the output signal to accentuate the alternating current components and block the direct current components, and

means responsive to the differentiated signal for indicating the presence of speech as long as the alternating component recurs within a preselected time interval.

2. A speech presence detector as set forth in claim 1 in which said differentiating means includes,

a first differentiating circuit means responsive to the output of the peak detecting means, and

a second differentiating circuit means responsive to first differentiating circuit means for performing a second differentiation of the signal from the peak detecting means for further accentuating the alternating components of the received signal and blocking the direct current components received.

3. A speech presence detector as set forth in claim 1 in which the means responsive to the differentiated signal includes,

a two input comparator means,

a reference voltage source connected to one of the comparator inputs,

a capacitor,

a charging circuit connected to said capacitor for charging it to a voltage exceeding the reference voltage connected to the comparator,

means connecting the capacitor to the other input of the comparator which provides an output indicative of which input is greater, and

means responsive to the differentiated signal for discharging the capacitor in accordance with the accentuated alternating components.

References Cited

UNITED STATES PATENTS
3,286,031 11/1966 Geddes 179-1

OTHER REFERENCES
Ives, Music Pulse Analyzer, Electronics, Apr. 1, 1957, pp. 183-184.

KATHLEEN H. CLAFFY, Primary Examiner
D. W. OLMS, Assistant Examiner

